 first-order regret


Improved Algorithms for Online Submodular Maximization via First-order Regret Bounds

Neural Information Processing Systems

We consider the problem of nonnegative submodular maximization in the online setting. At time step $t$, an algorithm selects a set $S_t \in \mathcal{C} \subseteq 2^V$, where $\mathcal{C}$ is a feasible family of sets. An adversary then reveals a submodular function $f_t$. The goal is to design an efficient algorithm for minimizing the expected approximate regret. In this work, we give a general approach for improving regret bounds in online submodular maximization by exploiting "first-order" regret bounds for online linear optimization. For monotone submodular maximization subject to a matroid, we give an efficient algorithm which achieves a $(1 - c/e - \varepsilon)$-regret of $O(\sqrt{kT \ln(n/k)})$, where $n$ is the size of the ground set, $k$ is the rank of the matroid, $\varepsilon > 0$ is a constant, and $c$ is the average curvature. Even without assuming any curvature (i.e., taking $c = 1$), this regret bound improves on previous results of Streeter et al. (2009) and Golovin et al. (2014). For nonmonotone, unconstrained submodular functions, we give an algorithm with 1/2-regret $O(\sqrt{nT})$, improving on the results of Roughgarden and Wang (2018). Our approach is based on Blackwell approachability; in particular, we give a novel first-order regret bound for the Blackwell instances that arise in this setting.
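To make "approximate regret" concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the standard definition of α-regret (α times the value of the best fixed feasible set in hindsight, minus the value the algorithm actually collected), uses coverage functions as a stand-in for monotone submodular $f_t$, and takes the feasible family to be all sets of size $k$ (a uniform matroid). The names `coverage`, `alpha_regret`, `rounds`, and `plays` are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def coverage(sets_by_item, chosen):
    """Coverage value: number of distinct elements covered by the chosen items.

    Coverage functions are a standard example of monotone submodular functions.
    """
    covered = set()
    for item in chosen:
        covered |= sets_by_item[item]
    return len(covered)


def alpha_regret(rounds, plays, k, alpha):
    """alpha * (best fixed size-k set in hindsight) - (total value of the plays).

    Brute force over all size-k subsets, so only suitable for tiny ground sets.
    """
    ground = range(len(rounds[0]))
    best_fixed = max(
        sum(coverage(f, S) for f in rounds)
        for S in combinations(ground, k)
    )
    earned = sum(coverage(f, S) for f, S in zip(rounds, plays))
    return alpha * best_fixed - earned


# Toy instance (illustrative data): 3 ground-set items, 2 rounds, k = 1.
rounds = [
    [{1, 2}, {2, 3}, {4}],   # f_1: what each item covers in round 1
    [{1}, {1, 2, 3}, {3}],   # f_2: what each item covers in round 2
]
plays = [(0,), (0,)]         # the learner picks item 0 in both rounds

exact_regret = alpha_regret(rounds, plays, k=1, alpha=1.0)
```

With `alpha=1.0` this is the ordinary regret; the abstract's matroid result corresponds to measuring regret against `alpha = 1 - c/e - eps` instead.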



Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination

Neural Information Processing Systems

On the technical side, we show that the logarithmic loss and an information-theoretic quantity called the triangular discrimination play a fundamental role in obtaining first-order guarantees, and we combine this observation with new refinements to the regression oracle reduction framework of Foster and Rakhlin [29].
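For readers unfamiliar with the quantity, here is a minimal sketch of the triangular discrimination between two discrete distributions, using the textbook definition $\Delta(p, q) = \sum_i (p_i - q_i)^2 / (p_i + q_i)$. This is an assumption about the general form only; the paper may apply it in a different normalization or to scalar predictions. The function name is hypothetical.

```python
def triangular_discrimination(p, q):
    """Sum over outcomes of (p_i - q_i)^2 / (p_i + q_i).

    Terms with p_i + q_i == 0 contribute nothing and are skipped.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi + qi > 0:
            total += (pi - qi) ** 2 / (pi + qi)
    return total


same = triangular_discrimination([0.5, 0.5], [0.5, 0.5])      # identical distributions
disjoint = triangular_discrimination([1.0, 0.0], [0.0, 1.0])  # disjoint supports
```

Identical distributions give 0, and distributions with disjoint supports give the maximum value of 2.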



Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits

Neural Information Processing Systems

In addition, we need only assumptions weaker than those of existing algorithms; our algorithms work on discrete action sets as well as continuous ones without a priori knowledge about losses, and they run efficiently if a linear optimization oracle for the action set is available.






Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination

Neural Information Processing Systems

Contextual bandits encompass both the general problem of statistical learning with function approximation (specifically, cost-sensitive classification) and the classical multi-armed bandit problem, yet present algorithmic challenges greater than the sum of both parts.